Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: Double Quote CLASSPATH to prevent shell glob expansion. #8191

Merged
merged 1 commit into from
Feb 29, 2020
Merged

MINOR: Double Quote CLASSPATH to prevent shell glob expansion. #8191

merged 1 commit into from
Feb 29, 2020

Conversation

andrewegel
Copy link
Contributor

In the event that the CLASSPATH does not have an ending ":", the shell
can expand the CLASSPATH globs to be space-separated list of paths/jars,
which is not how the JVM CLI accepts arguments to -cp switch. So
double quote the variable to prevent pattern expansion, and pass the
glob pattern directly to the JVM.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

In the event that the CLASSPATH does not have an ending ":", the shell
can expand the CLASSPATH globs to be space-separated list of paths/jars,
which is not how the JVM CLI accepts arguments to -cp switch. So
double quote the variable to prevent pattern expansion, and pass the
glob pattern directly to the JVM.
@ijuma
Copy link
Member

ijuma commented Feb 28, 2020

Thanks for the PR. Can you include a description of how this was validated?

@andrewegel
Copy link
Contributor Author

andrewegel commented Feb 28, 2020

I'll attempt to highlight the problem, and the fix, and provide a further justification on the fix.

First I added an echo before the exec to highlight what the issue is:

[centos@ip-172-31-11-17 ~]$ tail -14 /bin/kafka-run-class
CLASSPATH=${CLASSPATH#:}
echo "CP = $CLASSPATH"

# If Cygwin is detected, classpath is converted to Windows format.
(( CYGWIN )) && CLASSPATH=$(cygpath --path --mixed "${CLASSPATH}")


# Launch mode
if [ "x$DAEMON_MODE" = "xtrue" ]; then
  nohup $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@" > "$CONSOLE_OUTPUT_FILE" 2>&1 < /dev/null &
else
  echo $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@"
  exec $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@"
fi

Now we run it:

[centos@ip-172-31-11-17 ~]$ sudo /bin/zookeeper-server-start /etc/kafka/zookeeper.properties
CP = /bin/../share/java/kafka/*
java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -Xloggc:/var/log/kafka/zookeeper-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/var/log/kafka -Dlog4j.configuration=file:/etc/kafka/log4j.properties -cp /bin/../share/java/kafka/activation-1.1.1.jar /bin/../share/java/kafka/aopalliance-repackaged-2.5.0.jar [... more jars ...] org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/kafka/zookeeper.properties
Error: Could not find or load main class .bin....share.java.kafka.aopalliance-repackaged-2.5.0.jar

It comes down to the fact that CLASSPATH="/bin/../share/java/kafka/*" with no other : appended on it. This can vary across all deployments this code, so there could be anything in this script that appends more class paths to this variable based on the existence of other things, take this for example:

for file in "$clients_lib_dir"/kafka-clients*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

Any of these blocks may be appending class paths. Which is me trying to address the reason why this isn't an issue currently on AK-kafka.

The main point is, there can be cases/deployments where there the path is no longer appending paths, and just contains CLASSPATH="/bin/../share/java/kafka/*" - Why is this a problem? Because the shell (bash) expands this glob to path locations. If there was something else appended (like CLASSPATH="/bin/../share/java/kafka/*:/path/to/other.jar") that prevents shell glob expansion, because nothing can match a file path with : in it.

I now make the change by quoting $CLASSPATH:

[centos@ip-172-31-11-17 ~]$ tail -14 /bin/kafka-run-class
CLASSPATH=${CLASSPATH#:}
echo "CP = $CLASSPATH"

# If Cygwin is detected, classpath is converted to Windows format.
(( CYGWIN )) && CLASSPATH=$(cygpath --path --mixed "${CLASSPATH}")


# Launch mode
if [ "x$DAEMON_MODE" = "xtrue" ]; then
  nohup $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp "$CLASSPATH" $KAFKA_OPTS "$@" > "$CONSOLE_OUTPUT_FILE" 2>&1 < /dev/null &
else
  echo $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp "$CLASSPATH" $KAFKA_OPTS "$@"
  exec $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp "$CLASSPATH" $KAFKA_OPTS "$@"
fi

Class path is now no longer expanded (-cp /bin/../share/java/kafka/*)

[centos@ip-172-31-11-17 ~]$ sudo /bin/zookeeper-server-start /etc/kafka/zookeeper.properties
CP = /bin/../share/java/kafka/*
java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -Xloggc:/var/log/kafka/zookeeper-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/var/log/kafka -Dlog4j.configuration=file:/etc/kafka/log4j.properties -cp /bin/../share/java/kafka/* org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/kafka/zookeeper.properties
[2020-02-28 19:15:29,952] INFO Reading configuration from: /etc/kafka/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-02-28 19:15:29,961] INFO clientPortAddress is 0.0.0.0:2181 (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-02-28 19:15:29,961] INFO secureClientPort is not set (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-02-28 19:15:29,963] INFO autopurge.snapRetainCount set to 3 (org.apache.zookeeper.server.DatadirCleanupManager)
...
[2020-02-28 19:15:29,994] INFO Server environment:java.class.path=/bin/../share/java/kafka/jersey-client-2.28.jar:/bin/../share/java/kafka/metrics-5.0.0-ce-SNAPSHOT.jar

The JVM does the glob expansion itself from -cp /bin/../share/java/kafka/* - you can see that in the log line above.

Further justification: This is a common shell scripting bug folks run into, and hence why the shellcheck tool has an entire lint/page about this very thing:

https://github.com/koalaman/shellcheck/wiki/SC2086

Copy link
Member

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. LGTM.

@ijuma
Copy link
Member

ijuma commented Feb 29, 2020

Verified that kafka-server-start and zookeeper-server-start still work with this change, merged to trunk.

@ijuma ijuma merged commit 0d83894 into apache:trunk Feb 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants